An Exploration of Pursuing Professional Explorations

My Data Science Job Search: August - October, 2020

Rohan Lewis

2020.09.29 (Updated 2020.11.03)

I began applying for Information Technology jobs on August 11th, 2020, using LinkedIn as my primary source of information. I applied to large and small companies alike, through both EasyApply and standard applications via company job portals or website submissions, and to various locations across the US. I even applied to a job in Sydney, Australia (by accident) and a job in Toronto, Canada.

I saved my application information in an Excel spreadsheet as I applied, and decided to use it to practice some visualization techniques.

In [1]:
#General Packages.
from datetime import date
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import re

#Specific Packages
from wordcloud import WordCloud
import bar_chart_race as bcr
from IPython.display import Video
import plotly.graph_objects as go
from pywaffle import Waffle

#Read data.
df = pd.read_excel(r'C:\Users\Rohan\Documents\Data Science\Completed\Jobs\Jobs.xlsx')
lat_lons = pd.read_excel(r'C:\Users\Rohan\Documents\Data Science\Completed\Jobs\uscities.xlsx')

#Convert datetime columns to date.
df['Date_Applied'] = df['Date_Applied'].apply(lambda x: x.date())
df['Rejection_Email'] = df['Rejection_Email'].apply(lambda x: x.date())
df['Viewed_Email'] = df['Viewed_Email'].apply(lambda x: x.date())

#Current Numbers
print("As of " + date.today().strftime("%A, %B %d, %Y") + ", I have applied to " + str(df.shape[0]) + " jobs.")
As of Tuesday, November 03, 2020, I have applied to 556 jobs.
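As an aside, the three per-column `apply(lambda x: x.date())` calls above can also be written with pandas' `.dt.date` accessor, which avoids the per-row lambda. A minimal sketch on toy data (the column name mirrors the real spreadsheet):

```python
import pandas as pd

# Toy frame standing in for the real spreadsheet column.
toy = pd.DataFrame({'Date_Applied': pd.to_datetime(['2020-08-11', '2020-08-12'])})

# .dt.date converts a datetime64 column to python date objects in one call.
toy['Date_Applied'] = toy['Date_Applied'].dt.date

print(toy['Date_Applied'].iloc[0])  # 2020-08-11
```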

Job Title

This section looks at the specific job titles.

  1. I eliminated all non-alphabetical characters.

  2. A recurring theme in job titles was numbering, such as "Data Analyst II" or "Data Scientist III". Since I had already eliminated Arabic numerals, I also needed to eliminate Roman numerals (only I, II, and III appeared).

  3. The stripping mangled a few titles, so I restored those.
    1. "AR VR" was changed back to "AR/VR" for one title.
    2. "C C" was changed back to "C2C" for one title.
    3. "Microsoft" was changed to "Microsoft-365" for one title.
    4. "Non IT" was changed back to "Non-IT" for one title.

  4. Words were then tallied.

Word Frequencies

Below are the top 10 and bottom 10 (a sample of the words occurring only once) words from the job titles.

In [2]:
#Combine all job titles into one string.
jobs_string = ' '.join(df['Title'])
#Only letters are useful.
regex = re.compile('[^a-zA-Z]')
#Remove all non letters, and remove ' I ', ' II ', ' III '.
jobs_string = regex.sub(' ', jobs_string).replace(' I ',' ').replace(' II ',' ').replace(' III ',' ')
#Specific Replacement.
jobs_string = jobs_string.replace('AR VR','AR/VR').replace('C C', 'C2C').replace('Microsoft', 'Microsoft-365').replace('Non IT', 'Non-IT')

#Create a frequency distribution of all words in the job titles.
jobs_dict = {}
jobs_words = jobs_string.split()
for word in jobs_words :
    if word not in jobs_dict.keys() :
        jobs_dict[word] = 1
    else :
        jobs_dict[word] += 1

#Convert frequency distribution to dataframe, sort by frequency.
jobs_df = pd.DataFrame({'Word' : list(jobs_dict.keys()),
                       'Count' : list(jobs_dict.values())}).sort_values(by = 'Count', ascending = False, axis = 0).reset_index(drop = True)
jobs_df.head(10)
Out[2]:
Word Count
0 Data 509
1 Scientist 321
2 Analyst 133
3 Engineer 78
4 Learning 50
5 Machine 50
6 Analytics 28
7 Python 23
8 Science 21
9 Developer 21
In [3]:
jobs_df.tail(10)
Out[3]:
Word Count
186 Bioinformaticist 1
187 Training 1
188 Card 1
189 Green 1
190 or 1
191 Citizens 1
192 Local 1
193 Author 1
194 Content 1
195 Growth 1
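Two hedged notes on the cell above: the space-padded `replace(' I ', ' ')` calls can miss a Roman numeral at the very end of the combined string, and the manual tally loop is exactly what `collections.Counter` does. A sketch of both ideas on a made-up pair of titles:

```python
import re
from collections import Counter

titles = 'Data Scientist II Senior Data Analyst III'

# \b word boundaries catch numerals at the string ends too;
# match III before II before I so longer numerals win.
cleaned = re.sub(r'\b(?:III|II|I)\b', ' ', titles)

# Counter builds the word -> frequency mapping in one pass.
counts = Counter(cleaned.split())
print(counts['Data'])  # 2
```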

Word Cloud

I thought a word cloud would be a fun way to visualize the job titles. Word sizes have been rescaled rather than kept strictly proportional to frequency.

In [4]:
jobs_wc = WordCloud(background_color = 'white',
                    max_words = 300,
                    collocations = False,
                    relative_scaling = 0)
jobs_wc.generate(jobs_string)
plt.figure(figsize = (15, 11))

plt.imshow(jobs_wc, interpolation = 'bilinear')
plt.axis('off');

Companies

Some companies are hiring heavily. Some are recruiting and staffing agencies hiring on behalf of others.

Once my application was in a system on a particular company's career portal page, it was easy to reapply. I used this quite a bit for companies like Amazon, Google, MITRE, and PayPal.

I used LinkedIn's EasyApply for many applications as well.

For the others, I sometimes applied as a guest, sometimes only had to upload my resume and cover letter, and sometimes had to go through a 20-minute ordeal just for one opening. It varied. ¯\_( ͡° ͜ʖ ͡°)_/¯

Company Applications by Date

See Appendix for full table of cumulative applications by company and date, sorted alphabetically and chronologically, respectively.

In [5]:
#List of all companies.
companies = df['Company'].unique()
#Dates from first application to today.
date_index = pd.date_range(start = min(df['Date_Applied']), end = date.today())

#Create new data frame of 0s.
application_df = pd.DataFrame(index = date_index, columns = companies).fillna(0)

#Create cumulative count of job applications by company and date.
for i in range(len(df)) :
    company = df.iloc[i, 1]
    date_app = df.iloc[i, 7]
    application_df.loc[date_app:, company] += 1

#Alphabetical
application_df = application_df.reindex(sorted(application_df.columns), axis=1)
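For what it's worth, the row-by-row loop in the cell above can also be expressed as a daily tally followed by a cumulative sum; `crosstab` plus `cumsum` is a common pandas idiom for running counts. A sketch on a toy application log (names are illustrative):

```python
import pandas as pd

# Toy application log: one row per application.
toy = pd.DataFrame({'Company': ['Amazon', 'Google', 'Amazon'],
                    'Date_Applied': pd.to_datetime(['2020-08-11',
                                                    '2020-08-11',
                                                    '2020-08-13'])})

# Daily counts per company, reindexed to a continuous date range, then accumulated.
daily = pd.crosstab(toy['Date_Applied'], toy['Company'])
full_index = pd.date_range('2020-08-11', '2020-08-14')
running = daily.reindex(full_index, fill_value=0).cumsum()

print(running.loc['2020-08-14', 'Amazon'])  # 2
```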
In [6]:
#Total number of applications to each company.
cumulative_app_count = application_df.iloc[[-1]]
major_companies = cumulative_app_count.columns[(cumulative_app_count >= 6).iloc[0]]
minor_companies = cumulative_app_count.columns[(cumulative_app_count < 6).iloc[0]].tolist()

#Create a dummy company called 'Other', containing all companies with less than 6 applications.
major_df = pd.DataFrame.copy(application_df)
major_df['Other'] = major_df[minor_companies].sum(axis = 1)
major_df.drop(minor_companies, axis = 1, inplace = True)

#Check number of bars to include.
major_df.shape
Out[6]:
(85, 22)

Bar Chart Race

Below is an animation of the above data. Bar chart races run more smoothly with larger numbers, such as populations or monetary amounts, over longer periods of time, but I am happy with how this turned out.

In [7]:
bold_colors = ['#f0f0f0', '#3cb44b', '#e6194b', '#fffac8', '#9a6324', '#e6beff', '#fabebe', '#000075',
               '#ffe119', '#008080', '#4363d8', '#ffffff', '#bcf60c', '#46f0f0', '#911eb4', '#800000',
               '#f032e6', '#808000', '#ffd8b1', '#f58231', '#aaffc3', '#000000', '#ca1699']

apps = bcr.bar_chart_race(df = major_df,
                          filename = 'applications.mp4',
                          orientation = 'h',
                          sort = 'desc',
                          n_bars = 15,
                          cmap = bold_colors[0:22],
                          filter_column_colors = False,
                          period_fmt = '%B %d, %Y',
                          period_label = {'x': 0.99,
                                          'y': 0.26,
                                          'ha': 'right',
                                          'size': 14},
                          period_summary_func = lambda v, r: {'x': 0.99,
                                                              'y': 0.05,
                                                              's': f"{v.sum():,.0f} applications completed.\n\nCompanies with less than 6 applications are grouped in 'Other'.",
                                                              'ha': 'right',
                                                              'size': 9,
                                                              'weight': 'normal'},
                          title = 'Total Number of Jobs Applied to by Company and Date',
                          steps_per_period = 10)

Video("applications.mp4")
Out[7]:

Job Location

Some slight modifications were made to the LinkedIn location data during the application process:

  1. For Curate Partners, a job location was changed from Raleigh-Durham-Chapel Hill Area to Raleigh.
  2. For Parker & Lynch, a job location was changed from Orange County to Los Angeles.
  3. For Synectics, a job location was changed from Greater Chicago to Chicago.
  4. For several companies, job locations were changed from the "Greater" or "Metropolitan" area of a city to that city.

In addition, for Common App, the job location was changed from none specified to Arlington. I did not learn about their opening from LinkedIn.

By City

Remote Locations

Several jobs were advertised with no city, only "Remote".

In [8]:
df[df['City'] == "Remote"]
Out[8]:
Title Company Size City State_abbv State Date_Posted Date_Applied Rejection_Email Viewed_Email CoID JobID URL
195 Data Scientist Pilot Flying J 10001+ Remote NaN NaN 2020-09-04 00:00:00 2020-09-06 2020-09-17 NaT NaN 2.019452e+09 https://www.linkedin.com/jobs/search/?currentJ...
419 Forward Deployed Data Scientist Cresta 11-50 Remote NaN NaN 2020-09-29 00:00:00 2020-09-30 NaT NaT NaN 2.155368e+09 https://www.linkedin.com/jobs/search/?currentJ...
420 Data Scientist White Ops 51-200 Remote NaN NaN > 1 week 2020-09-30 2020-10-15 NaT NaN 2.023661e+09 https://www.linkedin.com/jobs/search/?currentJ...
433 Intern - AI/Machine Learning Seagate Technology 10001+ Remote OR Oregon 2020-09-30 00:00:00 2020-10-01 NaT NaT 201783 2.183372e+09 https://www.linkedin.com/jobs/search/?currentJ...
437 Platform Data Engineer Demyst 51-200 Remote NaN NaN 2020-10-01 00:00:00 2020-10-01 NaT NaT NaN 2.152177e+09 https://www.linkedin.com/jobs/search/?currentJ...
522 Data Scientist Possible 11-50 Remote NaN NaN 2020-10-14 00:00:00 2020-10-15 NaT 2020-10-16 NaN 2.183079e+09 https://www.linkedin.com/jobs/search/?currentJ...

I retrieved their office locations and manually entered them.

In [9]:
#Manual City, State_abbv, and State entry. 
dict_cs = {195: ("Knoxville", "TN", "Tennessee"),
           419: ("San Francisco", "CA", "California"),
           420: ("New York", "NY", "New York"),
           433: ("New York", "NY", "New York"),
           437: ("Portland", "OR", "Oregon"),
           522: ("Seattle", "WA", "Washington")}

#Loop to manually enter missing City, State_abbv, and State entry for Remote locations.
for idx in dict_cs :
    temp_city = dict_cs[idx][0]
    temp_state_abbv = dict_cs[idx][1]
    temp_state = dict_cs[idx][2]
    df.at[idx, 'City'] = temp_city
    df.at[idx, 'State_abbv'] = temp_state_abbv
    df.at[idx, 'State'] = temp_state

Missing Cities

Several jobs I applied to were in cities not present in the latitude/longitude spreadsheet I downloaded.

In [10]:
#Select relevant columns.
df_loc = df[['City', 'State_abbv', 'State', 'Date_Applied']]

#Select only city, state, and location columns.
lat_lons = lat_lons[['city', 'state_id', 'lat', 'lng']]

#Count the number of applications.
city_tally = df_loc.groupby(['City', 'State_abbv', 'State']).count().reset_index()
#Merge to get latitude, longitude for each city.
merged = pd.merge(city_tally,
                  lat_lons,
                  how = 'left',
                  left_on = ['City', 'State_abbv'],
                  right_on = ['city', 'state_id'])



#Several cities are not in the list.
merged[merged['city'].isna()]
Out[10]:
City State_abbv State Date_Applied city state_id lat lng
9 Bedford MA Massachusetts 4 NaN NaN NaN NaN
22 Bridgewater NJ New Jersey 2 NaN NaN NaN NaN
28 Center Valley PA Pennsylvania 1 NaN NaN NaN NaN
44 Dallas - Ft. Worth TX Texas 5 NaN NaN NaN NaN
91 Patuxent River MD Maryland 1 NaN NaN NaN NaN
125 Sydney AU Australia 1 NaN NaN NaN NaN
128 Toronto CN Canada 1 NaN NaN NaN NaN
130 Utica Rome NY New York 1 NaN NaN NaN NaN
137 Weston MA Massachusetts 1 NaN NaN NaN NaN
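A side note on finding the unmatched rows: `pd.merge` accepts `indicator=True`, which adds a `_merge` column marked `'left_only'` for rows with no match, equivalent to the `isna()` check above. A small sketch:

```python
import pandas as pd

# Toy versions of the application cities and the coordinate spreadsheet.
cities = pd.DataFrame({'City': ['Bedford', 'Boston'], 'State_abbv': ['MA', 'MA']})
coords = pd.DataFrame({'city': ['Boston'], 'state_id': ['MA'],
                       'lat': [42.36], 'lng': [-71.06]})

# indicator=True records whether each row matched: 'both' or 'left_only'.
m = pd.merge(cities, coords, how='left',
             left_on=['City', 'State_abbv'], right_on=['city', 'state_id'],
             indicator=True)

missing = m[m['_merge'] == 'left_only']
print(missing['City'].tolist())  # ['Bedford']
```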

Manual Removal and Entry

Two of these rows were the Australia and Canada jobs, which I removed. For the others, I retrieved coordinates from Google. An approximate average value was chosen for Dallas-Ft. Worth.

In [11]:
#Huxley job from Sydney, Australia was removed.
merged = merged[merged.State_abbv != 'AU']
#Prodigy Academy job from Toronto, Canada was removed.
merged = merged[merged.State_abbv != 'CN']

#Manual latitude and longitude entry. 
dict_loc = {9: (42.4906, -71.2760),
            22: (40.5940, -74.6049),
            28: (40.5294, -75.3937),
            44: (32.7598, -97.0646),
            91: (38.2773, -76.4229),
            130: (43.2763, -75.1545),
            137: (42.3668, -71.3031)}

#Loop to manually enter missing latitude and longitudes.
for idx in dict_loc :
    lat = dict_loc[idx][0]
    lon = dict_loc[idx][1]
    merged.at[idx, 'lat'] = lat
    merged.at[idx, 'lng'] = lon

#Rename columns and drop redundant columns    
merged = merged.rename(columns = {'Date_Applied' : 'Count',
                                  'lat': 'Latitude',
                                  'lng': 'Longitude'}).drop(['city', 'state_id'], axis = 1)
merged.shape
Out[11]:
(138, 6)

Chevy Chase

The row count after the left join was 12 off from what I expected. I soon determined that Chevy Chase, MD was duplicated in the coordinate spreadsheet for some reason.

In [12]:
merged[merged.City == 'Chevy Chase']
Out[12]:
City State_abbv State Count Latitude Longitude
34 Chevy Chase MD Maryland 1 38.9943 -77.0737
35 Chevy Chase MD Maryland 1 38.9819 -77.0833
In [13]:
merged = merged[(merged['Latitude'] != 38.9819) | (merged['Longitude'] != -77.0833)]

Mountain View

In addition, the coordinate spreadsheet lists two cities in California named Mountain View.

In [14]:
merged[merged.City == 'Mountain View']
Out[14]:
City State_abbv State Count Latitude Longitude
80 Mountain View CA California 11 37.4000 -122.0796
81 Mountain View CA California 11 38.0093 -122.1169

All cities, with their state, number of applications, and location, are displayed below.

In [15]:
merged = merged[(merged['Latitude'] != 38.0093) | (merged['Longitude'] != -122.1169)]
merged
Out[15]:
City State_abbv State Count Latitude Longitude
0 Anaheim CA California 1 33.8390 -117.8572
1 Andover MA Massachusetts 1 42.6554 -71.1418
2 Arlington VA Virginia 6 38.8786 -77.1011
3 Ashburn VA Virginia 1 39.0300 -77.4711
4 Asheville NC North Carolina 1 35.5704 -82.5536
... ... ... ... ... ... ...
135 Wellesley MA Massachusetts 6 42.3043 -71.2855
136 West Menlo Park CA California 2 37.4338 -122.2034
137 Weston MA Massachusetts 1 42.3668 -71.3031
138 Wilmington DE Delaware 1 39.7415 -75.5413
139 Woonsocket RI Rhode Island 1 42.0010 -71.4993

136 rows × 6 columns
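Both coordinate fixes above (Chevy Chase and Mountain View) keep the first match per city; the same effect could be had in one step with `drop_duplicates` on the city and state columns, assuming the column layout used above. A sketch:

```python
import pandas as pd

# Toy version of the merged frame with two duplicate-match cities.
merged_toy = pd.DataFrame({'City': ['Chevy Chase', 'Chevy Chase', 'Mountain View'],
                           'State_abbv': ['MD', 'MD', 'CA'],
                           'Latitude': [38.9943, 38.9819, 37.4000]})

# Keep only the first coordinate match for each (City, State_abbv) pair.
deduped = merged_toy.drop_duplicates(subset=['City', 'State_abbv'], keep='first')
print(len(deduped))  # 2
```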

By State

See Appendix for frequency by state.

In [16]:
#Group applications by state, sum the per-city counts, sort descending, and reset the index.
state_tally = merged[['Count', 'State_abbv', 'State']].groupby(['State_abbv', 'State']).sum().sort_values(by = 'Count', ascending = False, axis = 0).reset_index()

US Map

Below is an interactive map of the US. State shading and city marker size both reflect the number of applications.

In [17]:
#Singular or Plural.
def f(row) :
    if row['Count'] == 1 :
        string_val = ' application in '
    else :
        string_val = ' applications in '

    return string_val

#Number of applications per city as text.
merged['Text'] = merged['Count'].astype(str) + merged.apply(f, axis = 1) + merged['City'] + ', ' + merged['State_abbv'] + '.'

#Number of applications per state as text.
state_tally['Text'] = state_tally['Count'].astype(str) + state_tally.apply(f, axis = 1) + state_tally['State'] + '.'

#Color in states by number of applications.
state_map_data = go.Choropleth(locations = state_tally['State_abbv'],
                               z = state_tally['Count'],
                               text = state_tally['Text'],
                               hoverinfo = 'text',
                               locationmode = 'USA-states',
                               colorbar = {'title': "<b>Applications</b>",
                                           'thicknessmode': "pixels",
                                           'thickness': 70,
                                           'lenmode': "pixels",
                                           'len': 400,
                                           'titlefont': {'size': 16},
                                           'tickfont': {'size': 12},
                                           'tickvals': [0, 20, 40, 60, 80, 100]},
                               colorscale = 'Blues')

#Plot cities, size corresponds to number of applications.
city_map_data = go.Scattergeo(lon = merged['Longitude'],
                              lat = merged['Latitude'],
                              text = merged['Text'],
                              hoverinfo = 'text',
                              locationmode = 'USA-states',
                              marker = {'size': 10*np.sqrt(merged['Count']),
                                        'color': 'Darkgreen'})

data = [state_map_data, city_map_data]

fig = go.Figure(data = data)
fig.update_layout(title = {'text': 'Where I Have Applied (Hover and Zoom)',
                           'font': {'size': 30}},
                  geo_scope = 'usa',
                  width = 950,
                  height = 550)

Waffle Plot

Below is the distribution of applications by state.

In [18]:
#State and count columns (state_tally is already sorted by count).
waffle_data = state_tally[['State', 'Count']]

#Add a row for states with fewer than 6 applications.  'Other' starts at 2 to account for Australia and Canada.
waffle_data = waffle_data.append(pd.DataFrame({'State': ['Other'], 'Count': [2]})).reset_index(drop = True)

to_drop = []
#Add applications from states with fewer than 6 to 'Other' (row index 33).
for i in waffle_data.index :
    if waffle_data.iloc[i]['Count'] < 6:
        temp = waffle_data.iloc[i]['Count']
        waffle_data.at[33, 'Count'] += temp
        to_drop.append(i)

#Remove states with fewer than 6 applications.  Change orientation of data.
waffle_data = waffle_data.drop(labels = to_drop, axis = 0).reset_index(drop = True).set_index('State').transpose()
#print(waffle_data.sum(axis = 1))
#print(waffle_data.shape)
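The grouping loop above depends on 'Other' landing at a hard-coded row index; a mask-based version avoids that. A sketch on a toy tally with the same `State`/`Count` layout:

```python
import pandas as pd

# Toy state tally, already sorted by count.
tally = pd.DataFrame({'State': ['California', 'New York', 'Utah', 'Idaho'],
                      'Count': [127, 51, 1, 1]})

# Fold every state with fewer than 6 applications into a single 'Other' row.
small = tally['Count'] < 6
other = pd.DataFrame({'State': ['Other'], 'Count': [tally.loc[small, 'Count'].sum()]})
waffle = pd.concat([tally[~small], other], ignore_index=True)

print(waffle['Count'].tolist())  # [127, 51, 2]
```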
In [19]:
fig = plt.figure(FigureClass = Waffle,
                 rows = 13,
                 values = waffle_data.values.tolist()[0],
                 labels = waffle_data.columns.tolist(),
                 figsize = (14, 9),
                 colors = bold_colors[0:20],
                 legend = {'loc': 'upper left',
                           'ncol': 2,
                           'fontsize': 13})

Appendix

Company Applications by Date

In [20]:
pd.set_option('display.max_columns', None)
display(application_df)
AETEA Information Technology ALTEN AP Professionals ARYZTA Accelere Addison Group Addison Professional Financial Search LLC Aditi Consulting Advanced Auto Parts Age of Learning Agility Partners Alexander Technology Group Amazon American Bureau of Shipping Analytic Recruiting Inc. Apex Systems Artisan Talent Ascii Group, LLC Atlas Reasearch Austin Fraser Averity BCG Digital Ventures Bayside Solutions Bear Cognition Big Cloud Blastpoint BlueAlly Services BombBomb BrandMuscle Broadridge Brooksource Burtch Works CBTS CVS / Aetna CarMax CareHarmony Caserta Caterpillar CircleUp Cisco ClearBridge Technology Group Clever Devices Cloud9 Technologies, LLC Coders Data Coit Group Collabera, Inc. Common App CompuGain CoreSite Cornerstone Staffing Solutions, Inc. Coursera Crawford Thomas Recruiting Cresta Critical Mass Curate Partners CybeCys, Inc. CyberCoders DMI Dahl Consulting DataLab USA Delphi-US, LLC Demyst Dick's Sporting Goods Diversant, LLC Dstillery EPITEC EZCORP Edison Eliassen Group Enhance IT Entech Entelligence Envision EpicTec Evernote Expedia FICO Fandango FedEx Fladger Associates FleetCor Technologies, Inc. Flexton Flywheel Digital ForgeRock Forrest Solutions Further Enterprise Solutions Gambit Technologies Gap Inc. GitHub Good Apple Google Gradient AI Greater New York Insurance Companies Greenphire Gsquared Group HIRECLOUT HP Harnham Havas Media Group Hays Helen of Troy Hired by Matrix, Inc Hirewell Homesnap Horizon Media Horizontal Talent Hunter International Recruiting Huxley IBM IDR IDR, Inc. IQVIA ISO Ibotta Idexcel Illumination Works Innovative Systems Group Intelliswift Software, Inc. International Consulting Associates, Inc. JM Eagle JPI Jefferson Frank Jobot Jobspring Partners KGS Technology Group, Inc Kairos Living Kenshoo Knowable Komodo Health Kvaliro LevelUP LexisNexis Liberty Personnel Services LinkedIn LivePerson LockerDome M Science MITRE Macro Eyes Magnifi MaxisIT, Inc. 
McKinley Marketing Partners Media Assembly Meredith Corporation Mesh Recruiting, LLC Microsoft MindPool, Inc. Modis Moodys Northwest Consulting Motion Recruitment MotiveMetrics Mount Indie Next Insurance Ntelicor Nvidia OkCupid Olive Onebridge OpenArc, LLC. Optello Optomi PRI Technology Parker and Lynch Paro.io Patel Consulatants Pathrise PayPal Pilot Flying J Planet Pharma Possible Prime Team Partners Proclinical Ltd. Prodigy Education Puls Pyramid Consulting, Inc. Quadrant Resource R9 Digital Radiansys, Inc. Randstad Real Rent-A-Center Reply Retail Solutions Inc. Roblox SBS Creatix, LLC Sand Cherry Associates Sanjay Bharti, MD PLLC Scale Media Scion Staffing Seagate Technology Selling Simplified Servpro Industries, LLC ShootProof SkyWater Search Partners Softworld Sogeti Sonder Inc. Susquehanna International Group, LLP Swift Strategic Solutions Inc. Synectics Inc. Synergis Systecon North America TRC Staffing Services, Inc. Tech Observer TechWorkers Technology Ventures Tencent The AI Institute The Equus Group The Home Depot The Jacobson Group The Judge Group The Lab Consulting The Phoenix Group Tiger Analytics Time Topco Associates LLC Toyoda Gosei Americas Ursus, Inc. Valassis Vans Via Visible Walgreens White Ops Wimmer Solutions Wind River Wish Wonderlic WorldLink US Yoh, A Day & Zimmermann Company Yoh, A Day & Zimmermann Company, LLC s.com zyBooks
2020-08-11 0 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2020-08-12 0 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 11 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2020-08-13 0 0 0 0 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
2020-08-14 0 0 0 0 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
2020-08-15 0 0 0 0 0 0 0 0 0 0 0 0 13 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 12 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2020-10-30 2 1 1 1 4 1 1 1 1 1 1 1 29 1 1 12 1 1 1 3 3 1 1 1 1 1 1 1 1 1 5 3 1 13 1 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 9 2 1 1 2 1 5 6 1 1 1 1 13 1 1 1 1 1 8 3 1 1 1 1 5 1 1 1 1 1 3 6 10 1 19 1 1 1 1 2 4 5 2 1 1 1 1 1 1 2 1 2 2 3 1 1 1 3 1 1 1 2 1 1 2 1 8 4 2 1 1 1 1 1 1 1 1 1 3 1 2 21 1 1 1 1 2 1 1 7 1 3 1 1 1 1 3 1 1 1 1 1 2 40 1 1 2 1 1 6 22 1 1 1 1 1 1 1 2 1 1 1 3 1 1 1 1 2 1 3 1 1 6 6 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 2 7 1 1 1 1 1 1 1 1 3 1 1 1 1 1 2 1 1 6 1 1 1 3 1 1
2020-10-31 2 1 1 1 4 1 1 1 1 1 1 1 29 1 1 12 1 1 1 3 3 1 1 1 1 1 1 1 1 1 5 3 1 13 1 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 9 2 1 1 2 1 5 6 1 1 1 1 13 1 1 1 1 1 8 3 1 1 1 1 5 1 1 1 1 1 3 6 10 1 19 1 1 1 1 2 4 5 2 1 1 1 1 1 1 2 1 2 2 3 1 1 1 3 1 1 1 2 1 1 2 1 8 4 2 1 1 1 1 1 1 1 1 1 3 1 2 21 1 1 1 1 2 1 1 7 1 3 1 1 1 1 3 1 1 1 1 1 2 40 1 1 2 1 1 6 22 1 1 1 1 1 1 1 2 1 1 1 3 1 1 1 1 2 1 3 1 1 6 6 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 2 7 1 1 1 1 1 1 1 1 3 1 1 1 1 1 2 1 1 6 1 1 1 3 1 1
2020-11-01 2 1 1 1 4 1 1 1 1 1 1 1 29 1 1 12 1 1 1 3 3 1 1 1 1 1 1 1 1 1 5 3 1 13 1 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 9 2 1 1 2 1 5 6 1 1 1 1 13 1 1 1 1 1 8 3 1 1 1 1 5 1 1 1 1 1 3 6 10 1 19 1 1 1 1 2 4 5 2 1 1 1 1 1 1 2 1 2 2 3 1 1 1 3 1 1 1 2 1 1 2 1 8 4 2 1 1 1 1 1 1 1 1 1 3 1 2 21 1 1 1 1 2 1 1 7 1 3 1 1 1 1 3 1 1 1 1 1 2 40 1 1 2 1 1 6 22 1 1 1 1 1 1 1 2 1 1 1 3 1 1 1 1 4 1 3 1 1 6 6 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 2 7 1 1 1 1 1 1 1 1 3 1 1 1 1 1 2 1 1 6 1 1 1 3 1 1
2020-11-02 2 1 1 1 4 1 1 1 1 1 1 1 29 1 1 12 1 1 1 3 3 1 1 1 1 1 1 1 1 1 5 3 1 13 1 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 9 2 1 1 2 1 5 6 1 1 1 1 13 1 1 1 1 1 8 3 1 1 1 1 5 1 1 1 1 1 3 6 10 1 19 1 1 1 1 2 4 5 2 1 1 1 1 1 1 2 1 2 2 3 1 1 1 3 1 1 1 2 1 1 2 1 8 4 2 1 1 1 1 1 1 1 1 1 3 1 2 21 1 1 1 1 2 1 1 7 1 3 1 1 1 1 3 1 1 1 1 1 2 40 1 1 2 1 1 6 22 1 1 1 1 1 1 1 2 1 1 1 3 1 1 1 1 4 1 3 1 1 6 6 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 2 7 1 1 1 1 1 1 1 1 3 1 1 1 1 1 2 1 1 6 1 1 1 3 1 1
2020-11-03 2 1 1 1 4 1 1 1 1 1 1 1 29 1 1 12 1 1 1 3 3 1 1 1 1 1 1 1 1 1 5 3 1 13 1 1 1 6 1 4 1 1 1 1 1 1 1 1 1 1 1 3 1 1 2 1 9 2 1 1 2 1 5 6 1 1 1 1 13 1 1 1 1 1 8 3 1 1 1 1 5 1 1 1 1 1 3 6 10 1 19 1 1 1 1 2 4 5 2 1 1 1 1 1 1 2 1 2 2 3 1 1 1 3 1 1 1 2 1 1 2 1 8 4 2 1 1 1 1 1 1 1 1 1 3 1 2 21 1 1 1 1 2 1 1 7 1 3 1 1 1 1 3 1 1 1 1 1 2 40 1 1 2 1 1 6 22 1 1 1 1 1 1 1 2 1 1 1 3 1 1 1 1 4 1 3 1 1 6 6 1 1 1 1 2 2 1 1 1 2 1 1 1 1 1 1 1 1 2 7 1 1 1 1 1 1 1 1 3 1 1 1 1 1 2 1 1 6 1 1 1 3 1 1

85 rows × 234 columns

State Frequency

In [21]:
state_tally
Out[21]:
State_abbv State Count Text
0 CA California 127 127 applications in California.
1 NY New York 51 51 applications in New York.
2 WA Washington 48 48 applications in Washington.
3 TX Texas 43 43 applications in Texas.
4 VA Virginia 35 35 applications in Virginia.
5 IL Illinois 32 32 applications in Illinois.
6 PA Pennsylvania 30 30 applications in Pennsylvania.
7 MA Massachusetts 25 25 applications in Massachusetts.
8 NC North Carolina 23 23 applications in North Carolina.
9 GA Georgia 19 19 applications in Georgia.
10 DC District Of Columbia 17 17 applications in District Of Columbia.
11 MD Maryland 16 16 applications in Maryland.
12 CO Colorado 15 15 applications in Colorado.
13 OH Ohio 9 9 applications in Ohio.
14 MN Minnesota 9 9 applications in Minnesota.
15 NJ New Jersey 8 8 applications in New Jersey.
16 AZ Arizona 8 8 applications in Arizona.
17 CT Connecticut 7 7 applications in Connecticut.
18 MI Michigan 6 6 applications in Michigan.
19 OR Oregon 3 3 applications in Oregon.
20 KY Kentucky 3 3 applications in Kentucky.
21 MO Missouri 3 3 applications in Missouri.
22 TN Tennessee 3 3 applications in Tennessee.
23 IN Indiana 3 3 applications in Indiana.
24 RI Rhode Island 2 2 applications in Rhode Island.
25 SC South Carolina 2 2 applications in South Carolina.
26 UT Utah 1 1 application in Utah.
27 AR Arkansas 1 1 application in Arkansas.
28 KS Kansas 1 1 application in Kansas.
29 ID Idaho 1 1 application in Idaho.
30 FL Florida 1 1 application in Florida.
31 DE Delaware 1 1 application in Delaware.
32 WV West Virginia 1 1 application in West Virginia.